Tree Insertion Grammar: A Cubic-Time Parsable Formalism That Lexicalizes Context-Free Grammar Without Changing the Trees Produced
نویسندگان
چکیده
Tree insertion grammar (TIG) is a tree-based formalism that makes use of tree substitution and tree adjunction. TIG is related to tree adjoining grammar. However, the adjunction permitted in TIG is su ciently restricted that TIGs only derive context free languages and TIGs have the same cubic-time worst-case complexity bounds for recognition and parsing as context free grammars. An e cient Earley-style parser for TIGs is presented. Any context free grammar (CFG) can be converted into a lexicalized tree insertion grammar (LTIG) that generates the same trees. A constructive procedure is presented for converting a CFG into a left anchored (i.e., word initial) LTIG that preserves ambiguity and generates the same trees. The LTIG created can be represented very compactly by taking advantage of sharing between the elementary trees in it. Other methods for converting CFGs into a left anchored form, e.g., the methods of Greibach and Rosenkrantz, do not preserve the trees produced and result in very large output grammars. For the purpose of experimental evaluation, the LTIG lexicalization procedure was applied to eight di erent CFGs for subsets of English. The LTIGs created were smaller than the original CFGs. Using an implementation of the Earley-style TIG parser that was specialized for left anchored LTIGs, it was possible to parse more quickly with these LTIGs than with the original CFGs. This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonpro t educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research Laboratories of Cambridge, Massachusetts; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Research Laboratories. All rights reserved. Copyright c Mitsubishi Electric Research Laboratories, 1994 201 Broadway, Cambridge, Massachusetts 02139
منابع مشابه
Tree Insertion Grammar: Cubic-Time, Parsable Formalism that Lexicalizes Context-Free Grammar without Changing the Trees Produced
Tree insertion grammar (TIG) is a tree-based formalism that makes use of tree substitution and tree adjunction. TIG is related to tree adjoining grammar. However, the adjunction permitted in TIG is sufficiently restricted that TIGs only derive context-free languages and TIGs have the same cubic-time worst-case complexity bounds for recognition and parsing as context-free grammars. An efficient ...
متن کاملA Cubic-Time Parsable, Lexicalized Normal Form For Context-Free Grammar That Preserves Tree Structure
Lexicalized context-free grammar (LCFG) is a tree-based formalism that makes use of both tree substitution and a restricted form of tree adjunction. Because of its use of adjunction, LCFG allows su cient freedom in the way derivations can be performed that lexicalization of context-free grammars (CFGs) is possible while preserving the structure of the trees derived by the CFGs. However, the tre...
متن کاملAnálisis sintáctico combinado de gramáticas de adjunción de árboles y de gramáticas de inserción de árboles
Adjunction is a powerful operation that makes Tree Adjoining Grammar (TAG) useful for describing the syntactic structure of natural languages. In practice, a large part of wide coverage grammars written following the TAG formalism is formed by trees that can be combined by means of the simpler kind of adjunction defined for Tree Insertion Grammar. In this article, we describe a parsing algorith...
متن کاملCapturing CFLs with Tree Adjoining Grammars
We define a decidable class of TAGs that is strongly equivalent to CFGs and is cubic-time parsable. This class serves to lexicalize CFGs in the same manner as the LC, FGs of Schabes and Waters but with considerably less restriction on the form of the grammars . The class provides a nornlal form for TAGs that generate local sets m rnuch the same way that regular g rammars provide a normal form f...
متن کاملMixed Parsing of Tree Insertion and Tree Adjoining Grammars
Adjunction is a powerful operation that makes Tree Adjoining Grammar (TAG) useful for describing the syntactic structure of natural languages. In practice, a large part of wide coverage grammars written following the TAG formalism is formed by trees that can be combined by means of the simpler kind of adjunction defined for Tree Insertion Grammar. In this paper, we describe a parsing algorithm ...
متن کامل